1. INTRODUCTION

In recent years, the environmental problems caused by climate change and traditional fossil fuels
have become increasingly serious. Using sustainable renewable energy (RE) sources guards against
environmental deterioration and avoids the hazards and disruptions associated with nuclear power
[1], [2]. Because RE is powered by renewable resources such as the sun, wind, and water, it also results in
lower costs. Producing clean, green power further contributes to lowering pollutants and CO2 emissions.

The fundamentals of electricity production are largely the same for all RE sources. Wind power in
particular is among the most widely used, generating electricity by means of wind turbines [3]. Wind energy
is environmentally friendly and easy to exploit, which makes it an ideal large-scale source of renewable
energy. Its rapid development has nevertheless faced many challenges. Wind power production depends on
many factors, chiefly wind speed, which is variable and intermittent and therefore causes large fluctuations
in the power produced. Because of this extreme variability, integrating wind energy into the grid is
difficult. The impact of this variability can be reduced with a wind speed forecasting model [4], [5].

To predict wind energy, many strategies have been developed, which fall into three categories:
statistical methods, physical methods, and artificial intelligence models [6]. Statistical approaches,
including autoregressive (AR) and autoregressive integrated moving average (ARIMA) models, perform
better than numerical weather models in short-term forecasting and are considered simple methods [7].
However, because they are linear, statistical approaches cannot correctly predict nonlinear and
nonstationary wind energy [8]. Physical approaches such as numerical weather prediction (NWP) show high
precision in long-range forecasting when atmospheric conditions are stable [9], [10]. Nevertheless, the
computational complexity of these models increases greatly because of the detailed atmospheric
information they require [11]. Artificial intelligence approaches such as least squares support vector
machines (LSSVM) [12], artificial neural networks (ANN), and back-propagation (BP) algorithms are
powerful and broadly applied to wind energy forecasting with suitable precision. The ANN is often favored
because its nonlinear structure can capture the implicit functional relationship within historical time
series [13], which makes it an effective tool for wind energy prediction. For instance, the authors of [14]
performed a comprehensive comparative study of the forecast performance of three different ANNs based on
hourly average wind speed data. Meng et al. [15] noted that ANN has an advantage over other models in that
no supplementary information is needed other than historical wind speed data. Kani and Ardehali [16]
proposed a new technique for predicting short-range wind speeds using ANN and Markov chains. Moreover,
the wavelet neural network (WNN) is an effective prediction tool with fast convergence and excellent
results, which makes it one of the most efficient artificial neural networks; it has been broadly used for
time series forecasting in many domains, including wind power forecasting [17].

These works are all built on supervised learning, and they suffer from several shortcomings. Their
main drawbacks are the local minima in which training can become trapped and their slow convergence.
They are also sensitive to the input data and perform poorly on large or noisy datasets. Finally, these
models cannot adapt to significant changes in meteorological data [18].

Taking these issues into account, this article proposes a novel model to estimate the quantity of wind
energy produced, based on the regularized extreme learning machine (R-ELM) algorithm. R-ELM is founded
on the principle of structural risk minimization and on weighted least squares, and it addresses the issues
of the algorithms mentioned above when they are applied to wind energy forecasting. In many instances,
R-ELM significantly improves generalization without increasing the learning time [19], because the
connection weights of the hidden nodes are assigned at random and never updated. The inputs of the
proposed wind energy forecasting model are the previous wind speeds, which form a time series, while the
output is the next energy generated by the wind turbines. The number of nodes in the hidden layer is an
important hyperparameter that greatly affects the final output of the model. For this reason, the genetic
algorithm (GA) is applied in this paper to optimize the number of hidden neurons of the proposed R-ELM
model. GA is a computational technique that optimizes a problem by iteratively trying to improve a
candidate solution according to a fitness function [20]-[26]. It has shown strong optimization capability
compared to techniques such as particle swarm optimization (PSO) [27] and ant colony (AC) optimization [28].
The suggested model is called the regularized extreme learning machine genetic algorithm (R-ELM-GA).

The remainder of this article is organized as follows. Section 2 introduces the ELM and R-ELM
algorithms, presents the GA, and gives a detailed description of the proposed wind energy forecasting
model based on R-ELM-GA. Section 3 presents the simulation results. Section 4 concludes the paper.


2. METHOD 
2.1. Extreme learning machine 

ELM, introduced in [29], is a feedforward neural network with a single hidden layer. The ELM
architecture consists of an input layer, a hidden layer, and an output layer, connected by links called
weights. The input weights are chosen at random, and the output weights are computed using the
Moore-Penrose generalized inverse [30]. ELM can outperform other machine learning techniques at a low
computational cost.

Consider a single-hidden-layer feedforward neural network (SLFN) with L hidden nodes and a training
set $\{(x_i, t_i)\}_{i=1}^{N}$ of N distinct instances, where $x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^T$
contains n inputs and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]^T$ contains m outputs. With $g(x)$ the
activation function of the hidden layer, the network output is required to match the target $t_i$
according to:

$$\sum_{j=1}^{L} \beta_j \, g(w_j \cdot x_i + b_j) = t_i, \quad i = 1, \ldots, N \tag{1}$$

where $w_j$ and $b_j$ are the randomly assigned input weights and bias of the j-th hidden node, and
$\beta_j$ is the weight of the link between the j-th hidden node and the output nodes, as illustrated in
Figure 1.

[Figure: network diagram with an input layer of n nodes, a hidden layer of L nodes, and an output layer of m nodes]

Figure 1. Representation of the ELM construct


There are two key stages in the ELM learning phase. In the first stage, as previously mentioned, the
hidden layer's weights and biases are generated at random. The N conditions in (1) can then be written
compactly as:

$$H\beta = T \tag{2}$$

where

$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L} \tag{3}$$

and $T = [t_1, t_2, \ldots, t_N]^T$. In the second stage, the output weights are computed using the
generalized Moore-Penrose inverse of the hidden-layer output matrix H:

$$\beta = H^{\dagger} T \tag{4}$$

where $H^{\dagger}$ denotes the Moore-Penrose pseudo-inverse of H.
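
The two training stages can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sigmoid activation, the uniform range of the random weights, and the function names `elm_fit` and `elm_predict` are assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    """Assumed activation g; the paper does not fix a particular choice here."""
    return 1.0 / (1.0 + np.exp(-z))


def elm_fit(X, T, n_hidden, rng):
    """Train a basic ELM on inputs X (N x n) and targets T (N x m)."""
    n_features = X.shape[1]
    # Stage 1: input weights and biases are drawn at random and never updated.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ W + b)  # hidden-layer output matrix, shape (N, L)
    # Stage 2: output weights from the Moore-Penrose pseudo-inverse, as in (4).
    beta = np.linalg.pinv(H) @ T
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the trained network to new inputs."""
    return sigmoid(X @ W + b) @ beta
```

A call such as `elm_fit(X, T, n_hidden=12, rng=np.random.default_rng(0))` returns the random hidden layer together with the fitted output weights; no iterative training loop is involved.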


2.2. Regularized extreme learning machine 

Recently, ELM has gained great popularity and, owing to its speed and good generalization
performance, has been successfully applied in various fields. However, it relies only on empirical risk
minimization and tends to produce overfitted models [21]. Furthermore, it can yield less reliable
estimates, especially in the presence of heteroscedasticity or outliers in the data. Finally, ELM offers
little control over the solution because it directly computes the minimum-norm least-squares solution [31].
To fill these gaps, Deng et al. [31] proposed a new algorithm named R-ELM, founded on the principle of
structural risk minimization (SRM) and the weighted least squares method. In general, training an SLFN
amounts to finding $w_j$, $b_j$, and $\beta_j$ ($j = 1, \ldots, L$) such that:

$$\begin{aligned} &\min \ \|\varepsilon\|^2 \\ &\text{s.t.} \ \sum_{j=1}^{L} \beta_j \, g(w_j \cdot x_i + b_j) - t_i = \varepsilon_i, \quad i = 1, \ldots, N \end{aligned} \tag{5}$$

where $\varepsilon_i = [\varepsilon_{i1}, \varepsilon_{i2}, \ldots, \varepsilon_{im}]$ is the residual between the
actual value and the target value of the i-th sample, and $\varepsilon = [\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_N]$.
However, according to statistical learning theory, the true prediction risk consists of both the empirical
risk and the structural risk, so a well-generalized model should reach the best trade-off between the two.
The empirical risk is represented by the sum of squared errors, $\|\varepsilon\|^2$, and is given a weighting
factor $\gamma$ so that the proportion of the two risks can be regulated. The structural risk is represented
by $\|\beta\|^2$, whose minimization maximizes the margin separating the classes.
In addition, to obtain a robust estimate that attenuates the influence of outliers, the error $\varepsilon_i$
is weighted by the factor $v_i$. Thus, $\|\varepsilon\|^2$ is extended to $\|D\varepsilon\|^2$, where
$D = \mathrm{diag}(v_1, v_2, \ldots, v_N)$.
Therefore, the mathematical model of the R-ELM algorithm can be written as (6):

$$\begin{aligned} &\min \ \frac{1}{2}\|\beta\|^2 + \frac{1}{2}\gamma\|D\varepsilon\|^2 \\ &\text{s.t.} \ \sum_{j=1}^{L} \beta_j \, g(w_j \cdot x_i + b_j) - t_i = \varepsilon_i, \quad i = 1, 2, \ldots, N \end{aligned} \tag{6}$$

By tuning the weighting factor $\gamma$, one can adjust the ratio between the empirical risk and the
structural risk and find the balance that yields a model with good generalization performance. The
Lagrangian function of (6) can be written as (7):


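$$\begin{aligned} L(\beta, \varepsilon, \alpha) &= \frac{1}{2}\gamma\|D\varepsilon\|^2 + \frac{1}{2}\|\beta\|^2 - \sum_{i=1}^{N} \alpha_i \Big( \sum_{j=1}^{L} \beta_j \, g(w_j \cdot x_i + b_j) - t_i - \varepsilon_i \Big) \\ &= \frac{1}{2}\gamma\|D\varepsilon\|^2 + \frac{1}{2}\|\beta\|^2 - \alpha(H\beta - T - \varepsilon) \end{aligned} \tag{7}$$


where $\alpha_i \in \mathbb{R}$ ($i = 1, \ldots, N$) are the Lagrange multipliers of the equality constraints
in (6) and $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]$. Setting the gradient of this Lagrangian with
respect to $(\beta, \varepsilon, \alpha)$ to zero gives the optimality conditions (8):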


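$$\begin{cases} \dfrac{\partial L}{\partial \beta} = 0 \ \Rightarrow \ \beta^T = \alpha H, \\[4pt] \dfrac{\partial L}{\partial \varepsilon} = 0 \ \Rightarrow \ \gamma \varepsilon^T D^2 + \alpha = 0, \\[4pt] \dfrac{\partial L}{\partial \alpha} = 0 \ \Rightarrow \ H\beta - T - \varepsilon = 0. \end{cases} \tag{8}$$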


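Substituting the last condition of (8) into the second one gives an explicit expression for $\alpha$ (9),
and $\varepsilon_i$ can then be computed from $\alpha$ (10):

$$\alpha = -\gamma (H\beta - T)^T D^2 \tag{9}$$

$$\varepsilon_i = -\frac{\alpha_i}{\gamma v_i^2} \tag{10}$$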
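
The step from the optimality conditions to the closed-form solution can be made explicit. The following sketch restates the three stationarity conditions, uses the fact that the diagonal matrix D is symmetric, and eliminates $\varepsilon$ and $\alpha$ in turn:

$$
\begin{aligned}
\varepsilon &= H\beta - T, \\
\alpha &= -\gamma\,\varepsilon^{T}D^{2} = -\gamma\,(H\beta - T)^{T}D^{2}, \\
\beta &= H^{T}\alpha^{T} = -\gamma\,H^{T}D^{2}(H\beta - T), \\
\left(\frac{I}{\gamma} + H^{T}D^{2}H\right)\beta &= H^{T}D^{2}T .
\end{aligned}
$$

The last line is a regularized, weighted form of the normal equations; applying the (pseudo-)inverse of the matrix on its left-hand side yields the output weights.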


Solving (8) gives the solution for $\beta$:

$$\beta = \left( \frac{I}{\gamma} + H^T D^2 H \right)^{\dagger} H^T D^2 T \tag{11}$$

where I is the identity matrix. When D is the identity matrix I, $\beta$ can be calculated as:

$$\beta = \left( \frac{I}{\gamma} + H^T H \right)^{\dagger} H^T T \tag{12}$$

In this case the algorithm is named unweighted regularized ELM (UWR-ELM). Indeed, the traditional ELM
is the special case of UWR-ELM obtained when $\gamma \to \infty$. There are several ways to compute the
weights $v_i$, such as (13):


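$$v_i = \begin{cases} 1, & \left|\varepsilon_i / \hat{s}\right| \le c_1 \\[4pt] \dfrac{c_2 - \left|\varepsilon_i / \hat{s}\right|}{c_2 - c_1}, & c_1 \le \left|\varepsilon_i / \hat{s}\right| \le c_2 \\[4pt] 10^{-4}, & \text{otherwise} \end{cases} \tag{13}$$


where the constants $c_1$ and $c_2$ are usually set to 2.5 and 3, respectively, and $\hat{s}$ is a robust
estimate of the standard deviation of the unweighted error variables $\varepsilon_i$, computed as (14):

$$\hat{s} = \frac{IQR}{2 \times 0.6745} \tag{14}$$


where IQR is the interquartile range, that is, the difference between the 75th percentile and the 25th
percentile.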
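
The weighted solution can be sketched in NumPy as a two-pass procedure: an unweighted fit, robust weights computed from its residuals, then a weighted refit. This is an illustrative sketch, not the authors' code; the function names are hypothetical, a single output (m = 1) is assumed, and the residuals are assumed not to be all identical (so that the scale estimate is nonzero).

```python
import numpy as np


def relm_solve(H, T, gamma, v=None):
    """Output weights beta = (I/gamma + H^T D^2 H)^+ H^T D^2 T, D = diag(v)."""
    n_samples, n_hidden = H.shape
    # v=None corresponds to D = I, i.e. the unweighted variant.
    d2 = np.ones(n_samples) if v is None else np.asarray(v, dtype=float) ** 2
    HtD2 = H.T * d2  # equals H^T D^2 because D is diagonal
    A = np.eye(n_hidden) / gamma + HtD2 @ H
    return np.linalg.pinv(A) @ (HtD2 @ T)


def robust_weights(residuals, c1=2.5, c2=3.0):
    """Weights v_i from standardized residuals, using the IQR-based scale."""
    q75, q25 = np.percentile(residuals, [75, 25])
    s_hat = (q75 - q25) / (2 * 0.6745)  # robust standard deviation estimate
    z = np.abs(residuals / s_hat)
    v = np.full(z.shape, 1e-4)          # default: strongly down-weighted
    mid = (z > c1) & (z <= c2)
    v[mid] = (c2 - z[mid]) / (c2 - c1)  # linear taper between c1 and c2
    v[z <= c1] = 1.0                    # small residuals keep full weight
    return v


def weighted_relm(H, T, gamma):
    """Two-pass fit: unweighted solution, then re-solve with robust weights."""
    beta0 = relm_solve(H, T, gamma)
    eps = (H @ beta0 - T).ravel()       # residuals of the unweighted pass
    v = robust_weights(eps)
    return relm_solve(H, T, gamma, v), v
```

Here `H` is the hidden-layer output matrix of a network whose random input weights have already been drawn; samples with very large standardized residuals receive the weight 1e-4 and so barely influence the refit.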


2.3. Genetic algorithm

The genetic algorithm, proposed in [22], is a computational model that simulates Darwinian natural
selection and the genetic mechanism of biological evolution. It searches for optimal solutions to complex
problems by imitating the process of natural evolution. The core elements of the genetic algorithm are the
initial population, parameter encoding, genetic operators, the fitness function, and the control
parameters [23]. The genetic process mainly comprises three operators: selection, crossover, and mutation.
The control parameters mainly comprise the population size and the probabilities of applying the genetic
operators.
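
As an illustration of these elements, the following generic sketch minimizes a fitness function over integers encoded as binary chromosomes. The specific choices (tournament selection, one-point crossover, bit-flip mutation, and all default parameter values) are assumptions for the example, not the configuration used in this paper.

```python
import random


def decode(bits):
    """Parameter encoding: binary chromosome (list of 0/1) -> integer."""
    return int("".join(map(str, bits)), 2)


def genetic_minimize(fitness, n_bits=4, pop_size=8, generations=30,
                     p_cross=0.8, p_mut=0.1, seed=0):
    """Return the best integer found; lower fitness is better."""
    rng = random.Random(seed)
    # Initial population of random chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = min(pop, key=lambda c: fitness(decode(c)))

    def select():
        # Selection: binary tournament between two random individuals.
        a, b = rng.sample(pop, 2)
        return a if fitness(decode(a)) <= fitness(decode(b)) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            c1, c2 = p1[:], p2[:]
            if rng.random() < p_cross:
                # Crossover: swap the tails after a random cut point.
                cut = rng.randint(1, n_bits - 1)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (c1, c2):
                # Mutation: flip each bit with probability p_mut.
                children.append(
                    [1 - g if rng.random() < p_mut else g for g in child])
        pop = children[:pop_size]
        gen_best = min(pop, key=lambda c: fitness(decode(c)))
        if fitness(decode(gen_best)) < fitness(decode(best)):
            best = gen_best  # keep the best solution seen so far
    return decode(best)
```

For example, `genetic_minimize(lambda x: (x - 11) ** 2)` searches the integers 0 to 15; the population size and the two probabilities play the role of the control parameters mentioned above.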


2.4. Proposed method 

Wind energy is one of the most important sources of renewable energy. It is considered suitable and
promising and attracts growing global attention thanks to advantages such as environmental protection
and ease of use. Wind turbines generate electricity by converting the kinetic energy of the wind into
usable power. The power delivered by the generator can be calculated as (15) [32]:


$$P_w = \begin{cases} 0, & V < V_{min} \\ aV^2 + bV + c, & V_{min} \le V < V_r \\ P_{wn}, & V_r \le V < V_{max} \end{cases} \tag{15}$$


where:

$P_{wn}$: the rated power of the wind turbine
$V$: the wind speed
$V_{min}$: the cut-in wind speed required to start the wind turbine
$V_{max}$: the cut-off wind speed of the wind turbine
$V_r$: the rated wind speed
$a$, $b$, and $c$: constants that depend on the wind turbine type
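
The piecewise power curve translates directly into a small function. In this sketch the function name and argument names are hypothetical, and returning zero power at or above the cut-off speed is an assumption (the formula itself does not state the output in that range).

```python
def wind_power(v, v_min, v_r, v_max, p_rated, a, b, c):
    """Turbine output for wind speed v, following the piecewise power curve."""
    if v < v_min:
        return 0.0                       # below cut-in speed: no production
    if v >= v_max:
        return 0.0                       # assumption: turbine stopped at cut-off
    if v < v_r:
        return a * v**2 + b * v + c      # quadratic region between cut-in and rated
    return p_rated                       # rated power between rated and cut-off
```

With illustrative values such as `v_min=3`, `v_r=12`, `v_max=25`, `p_rated=1350`, `a=10`, `b=0`, and `c=-90`, a speed of 5 m/s falls in the quadratic region and 15 m/s yields the rated power.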

The main drawback of wind power is the great variability of wind speed, which makes it difficult to
control and optimize power production. It complicates the planning of reliable wind power systems, slows
their development, and makes the integration of wind power into the grid challenging. The effects of
fluctuating wind speed can be reduced by building a prediction model that estimates the wind power
produced. To determine the power output, meteorological measurements such as wind speed are taken as
inputs because the efficiency of the generation unit depends on the weather conditions.

The literature contains several works on the prediction of wind energy generation. However, they
suffer from difficulties such as local minima and slow convergence. R-ELM is a strong algorithm that has
proved capable of overcoming these shortcomings.

In this article, we therefore propose a wind energy forecasting model that combines the R-ELM
algorithm and GA. The proposed model exploits the main advantages of the R-ELM algorithm (very fast
convergence and good generalization ability) while optimizing the number of hidden nodes, a
hyperparameter, using the GA. Since past wind speeds contain hidden information and correlations that
affect the next wind power generated, we built a prediction model with wind speed values as the network
inputs. The output corresponds to the hourly wind energy produced. According to [19], only 8 past wind
speed values are sufficient to correlate past wind speed with the next value, so we use the last 8 wind
speed values as inputs. The developed model, designated R-ELM-GA, retains the main features of the R-ELM
technique while avoiding both the random selection of the number of hidden nodes and repeated trials,
which require more training time and slow convergence. Following [33], the number of hidden nodes L can
be determined as (16):


$$L = \sqrt{n + m} + \alpha \tag{16}$$


where $\alpha$ is a constant with $1 \le \alpha \le 10$.

Our model has eight inputs and one output, so $\sqrt{n + m} = 3$ and, according to (16), the number of
hidden nodes L ranges from 4 to 13. The set {4, ..., 13} is treated as the population of candidate
solutions for the GA, with the mean square error (MSE) as the fitness (objective) function. The phases of
the R-ELM-GA method are presented in the flowchart in Figure 2.

[Figure: flowchart in two parts. Part 1: initialize the population of L in {4, ..., 13}; initialize the weights w and biases b of R-ELM; use the R-ELM error on the training set as the fitness function; selection operation; mutation operation; check whether the stopping conditions are satisfied. Part 2: take the optimum L and predict the wind power using the optimal R-ELM.]


Figure 2. The different steps of the R-ELM-GA algorithm
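
Two data-preparation steps of the proposed method lend themselves to a short sketch: deriving the candidate set of hidden-node counts from the inputs and outputs, and turning the wind speed series into samples of eight past values paired with the next power value. The function names are hypothetical, and rounding the square root for sizes that are not perfect squares is an assumption.

```python
import math

import numpy as np


def hidden_node_candidates(n_inputs, n_outputs, alpha_min=1, alpha_max=10):
    """Candidate values of L = sqrt(n + m) + alpha for integer alpha."""
    base = round(math.sqrt(n_inputs + n_outputs))  # assumption: round if needed
    return list(range(base + alpha_min, base + alpha_max + 1))


def make_lagged(speed, power, n_lags=8):
    """Rows of X hold n_lags consecutive past speeds; y is the next power."""
    speed = np.asarray(speed, dtype=float)
    power = np.asarray(power, dtype=float)
    X = np.array([speed[i:i + n_lags] for i in range(len(speed) - n_lags)])
    y = power[n_lags:]                             # value following each window
    return X, y
```

`hidden_node_candidates(8, 1)` yields the integers 4 through 13, which form the search space of the GA; each candidate would then be scored by the training MSE of an R-ELM with that many hidden nodes.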


3. RESULTS AND DISCUSSION 

In this section, we present the results and discuss the numerical analyses. To evaluate the
effectiveness of the proposed R-ELM-GA wind prediction model, we used a set of wind speeds from
Tetouan City in Morocco [30]. The dataset was divided into training and testing sets: 70% of the
instances were used to train the model and the remaining 30% to evaluate its performance.

The R-ELM-GA method is implemented in Python. The network has eight input nodes and one output
node. The GA was used to optimize the number of hidden nodes within the set {4, ..., 13}, and the
optimization procedure found the optimum at L = 12. To assess the forecasting performance of the model,
we conducted a comparative study against algorithms commonly used in wind energy forecasting, namely
R-ELM [34], the basic ELM [30], BP [14], and support vector machines (SVM) [35].

To study the efficacy of the proposed model and its ability to perform well in critical seasons, we
added a comparison for the summer and winter months. Figures 3-7 compare the predictions of the BP, SVM,
ELM, R-ELM, and R-ELM-GA models, respectively, with the measured values for one month in summer.
Figures 8-12 show the same comparison for one month in winter. Based on these figures, the R-ELM-GA
model predicts well in both the summer and winter seasons, as its curve closely follows the observed
curve.

[Figures 3-12: plots of wind power (Watt) against time (h)]

Figure 3. Wind power predicted by BP in summer
Figure 4. Wind power predicted by SVM in summer
Figure 5. Wind power predicted by ELM in summer
Figure 6. Wind power predicted by R-ELM in summer
Figure 7. Wind power predicted by R-ELM-GA in summer
Figure 8. Wind power predicted by BP in winter
Figure 9. Wind power predicted by SVM in winter
Figure 10. Wind power predicted by ELM in winter
Figure 11. Wind power predicted by R-ELM in winter
Figure 12. Wind power predicted by R-ELM-GA in winter

The predictions of the five models for the summer and winter seasons are summarized in Tables 1 and 2,
respectively. The tables report the convergence time of each technique in seconds and the MSE score,
which is computed as (17):


$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - o_i)^2 \tag{17}$$


where n is the number of instances in the test set, $o_i$ is the predicted output, and $y_i$ is the measured output.
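
For reference, the MSE score corresponds to the following short NumPy function; the function name is a hypothetical choice for this sketch.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between measured and predicted outputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))
```

For instance, measured values [1, 2, 3] and predictions [1, 2, 5] differ only in the last sample, by 2, so the score is 4/3.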

The results in Tables 1 and 2 highlight the performance of the proposed R-ELM-GA method in terms of
MSE. BP and SVM give the largest MSE of all the models, reflecting a considerable discrepancy between the
test values and their forecasts: about 0.8579 for BP and 0.7256 for SVM in the summer season. The MSE of
ELM and R-ELM is about 0.1827 and 0.1853, respectively, still larger than that of R-ELM-GA, which is
about 0.1137.

For the winter season, the MSE is about 0.9251 for BP and about 0.8689 for SVM, while the errors of
ELM and R-ELM are about 0.3486 and 0.1932, respectively. All these models produce larger errors than
R-ELM-GA, whose MSE is about 0.1573.

In addition, BP and SVM have the longest convergence times, since they require a lengthy iterative
computation. ELM, R-ELM, and the proposed model give the lowest values. Although the proposed model adds
a GA search over the number of hidden nodes, its convergence time remains small. The R-ELM-GA approach
thus provided better forecasts than the other models while converging far faster than BP and SVM, which
illustrates the adaptability of the method.


Table 1. Comparison of different models in the summer

Prediction method    MSE       Time convergence (s)
R-ELM-GA             0.1137    0.8153
R-ELM                0.1853    0.7283
ELM                  0.2597    0.6324
SVM                  0.7256    4.1527
BP                   0.8759    5.1725

Table 2. Comparison of different models in the winter

Prediction method    MSE       Time convergence (s)
R-ELM-GA             0.1573    0.8582
R-ELM                0.1932    0.7591
ELM                  0.3486    0.6816
SVM                  0.8689    5.9872
BP                   0.9251    6.2853


4. CONCLUSION 

The safe operation of the electricity grid requires a reliable supply of wind energy, one of the most
used RE sources, but this remains difficult because of the instability and intermittency of wind speed.
Forecasting wind speed has therefore become essential for the effective utilization of energy. In this
article, we presented a forecasting model for wind energy generation that combines R-ELM and GA, called
R-ELM-GA. Based on past wind speed values, the model forecasts the next wind energy produced by the wind
turbines. The GA was employed to choose the network design of the R-ELM model, that is, the number of
hidden nodes. The simulation results highlighted the performance of the R-ELM-GA algorithm, which
produced better predictions than the other algorithms compared. It is a fast and effective learning
algorithm, and we conclude that R-ELM-GA can be applied flexibly to wind power forecasting. In future
work, we will focus on enhancing the R-ELM-GA model with a twofold optimization strategy that jointly
selects the number of hidden nodes and the regularization parameter of R-ELM.